Parsimonious Relevance Models for Multiple Corpora

نویسندگان

  • Edgar Meij
  • Wouter Weerkamp
  • Krisztian Balog
  • Maarten de Rijke
چکیده

We describe a method for applying parsimonious language models to re-estimate the term probabilities assigned by relevance models. We apply our method to six topic sets from test collections in five different genres. Our parsimonious relevance models (i) improve retrieval effectiveness in terms of MAP on all collections, (ii) significantly outperform their non-parsimonious counterparts on most measures, and (iii) have a precision enhancing effect, unlike other blind relevance feedback methods.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

استخراج پیکره‌ موازی از اسناد قابل‌مقایسه برای بهبود کیفیت ترجمه در سیستم‌های ترجمه ماشینی

Data used for training statistical machine translation method are usually prepared from three resources: parallel, non-parallel and comparable text corpora. Parallel corpora are an ideal resource for translation but due to lack of these kinds of texts, non-parallel and comparable corpora are used either for parallel text extraction. Most of existing methods for exploiting comparable corpora loo...

متن کامل

Parsimonious Language Models for a Terabyte of Text

The aims of this paper are twofold. Our first aim is to compare results of the earlier Terabyte tracks to the Million Query track. We submitted a number of runs using different document representations (such as full-text, title-fields, or incoming anchor-texts) to increase pool diversity. The initial results show broad agreement in system rankings over various measures on topic sets judged at b...

متن کامل

The EU Parliament in clouds

In this study parsimonious language models were used to construct word clouds of the proceedings of the European Parliament. Multiple design choices had to be made and are discussed. Important features are stemming during tokenization, including bigrams into the word cloud and multi-lingualism. Also, the original parsimonious language models were extended with an additional term dampening unigr...

متن کامل

Advertising Keyword Suggestion Using Relevance-Based Language Models from Wikipedia Rich Articles

When emerging technologies such as Search Engine Marketing (SEM) face tasks that require human level intelligence, it is inevitable to use the knowledge repositories to endow the machine with the breadth of knowledge available to humans. Keyword suggestion for search engine advertising is an important problem for sponsored search and SEM that requires a goldmine repository of knowledge. A recen...

متن کامل

Temporal Discourse Models For Narrative Structure

Getting a machine to understand human narratives has been a classic challenge for NLP and AI. This paper proposes a new representation for the temporal structure of narratives. The representation is parsimonious, using temporal relations as surrogates for discourse relations. The narrative models, called Temporal Discourse Models, are treestructured, where nodes include abstract events interpre...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2008